Advances in natural language processing (NLP) are spreading across domains in the form of practical applications and academic interest. The legal domain inherently contains a large amount of data in text format, and therefore requires NLP to be applied to cater to its analytically demanding needs. Identifying the important sentences, facts, and arguments in a legal case is a tedious task for legal professionals. In this study, we explore the use of sentence embeddings to identify important sentences in a legal case, from the perspective of the main parties involved in the case. In addition, a task-specific loss function is defined to improve on the accuracy attainable through the direct use of categorical cross-entropy loss.
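The abstract does not spell out the form of the task-specific loss. As a minimal sketch only, assuming a PyTorch binary sentence classifier, a class-weighted cross-entropy is one common way to go beyond plain categorical cross-entropy when important sentences are rare; the weights, shapes, and labels below are hypothetical.

```python
import torch
import torch.nn as nn

# Hypothetical class weights: important sentences are rare, so they are
# up-weighted relative to unimportant ones (values are illustrative only).
class_weights = torch.tensor([0.3, 0.7])

# Standard categorical cross-entropy, reweighted per class.
loss_fn = nn.CrossEntropyLoss(weight=class_weights)

# logits: (batch, 2) scores from a sentence classifier over embeddings;
# labels: 1 = important to a main party, 0 = not important.
logits = torch.randn(8, 2)
labels = torch.randint(0, 2, (8,))
loss = loss_fn(logits, labels)
print(loss.item())
```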
Traditionally, data analysis and theory have been viewed as separate disciplines, each feeding into fundamentally different types of models. Modern deep learning technology is beginning to unify these two disciplines and will produce a new class of predictively powerful space weather models that combine the physical insights gained from data and theory. We call on NASA to invest in the research and infrastructure necessary for the heliophysics community to take advantage of these advances.
We seek methods to model, control, and analyze robot teams performing environmental monitoring tasks. During environmental monitoring, the goal is to have teams of robots collect various data throughout a fixed region for extended periods of time. Standard bottom-up task assignment methods do not scale as the number of robots and task locations increases, and they require computationally expensive replanning. Alternatively, top-down methods have been used to combat computational complexity, but most have been limited to the analysis of methods that focus on transition times between tasks. In this work, we study a class of nonlinear macroscopic models that we use to control a time-varying distribution of robots performing different tasks throughout an environment. Our proposed ensemble model and controller maintain desired time-varying populations of robots by leveraging naturally occurring interactions between robots performing tasks. We validate our approach at multiple fidelity levels, including experimental results, demonstrating its effectiveness for environmental monitoring.
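The abstract does not give the macroscopic model's equations. As an illustrative sketch under generic assumptions, a mean-field rate model evolves the fraction of robots at each task under a rate matrix whose columns sum to zero, so the total population is conserved; the rates and task count below are made up.

```python
import numpy as np

# Minimal mean-field sketch (not the paper's model): x[i] is the fraction
# of robots at task i, and off-diagonal K[i, j] is the rate of switching
# from task j to task i. Columns of K sum to zero, conserving robots.
def transition_matrix(rates):
    K = np.array(rates, dtype=float)
    np.fill_diagonal(K, 0.0)
    K -= np.diag(K.sum(axis=0))  # diagonal outflow balances column inflow
    return K

K = transition_matrix([[0.0, 0.2, 0.1],
                       [0.3, 0.0, 0.4],
                       [0.1, 0.5, 0.0]])

x = np.array([1.0, 0.0, 0.0])   # all robots start at task 0
dt = 0.01
for _ in range(2000):            # forward-Euler integration of dx/dt = K x
    x += dt * (K @ x)
print(x, x.sum())                # distribution converges; total is conserved
```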
The field of robotics, and more specifically humanoid robotics, has several established competitions with research-oriented goals in mind. By challenging the robots in a handful of tasks, these competitions provide a way to gauge the state of the art in robotic design, as well as an indicator of how far we are from reaching human performance. The most notable competitions are RoboCup, which has the long-term goal of competing against a real human team in 2050, and the FIRA HuroCup league, in which humanoid robots have to perform tasks based on actual Olympic events. Having robots compete against humans under the same rules is a challenging goal, and we believe that it is in the sport of archery that humanoid robots have the most potential to achieve it in the near future. In this work, we take a first step in this direction. We present a humanoid robot that is capable of gripping, drawing, and shooting a recurve bow at a target 10 meters away with considerable accuracy. Additionally, we show that it is also capable of shooting at distances of over 50 meters.
Automatic Text Summarization (ATS) is becoming increasingly relevant with the growth of textual data; however, with the popularization of public large-scale datasets, some recent machine learning approaches have focused on dense models and architectures that, despite producing notable results, usually yield models that are difficult to interpret. Given the challenge of interpretable learning-based text summarization and the importance it may have for advancing the current state of the ATS field, this work studies the application of two modern Generalized Additive Models with interactions, namely the Explainable Boosting Machine and GAMI-Net, to the extractive summarization problem, based on linguistic features and binary classification.
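As a toy sketch of this setup, and not the paper's actual features or data, the interpretml package's ExplainableBoostingClassifier can be fit on per-sentence linguistic features for a binary keep/drop decision; the feature names and synthetic labels below are invented for illustration.

```python
import numpy as np
from interpret.glassbox import ExplainableBoostingClassifier

# Illustrative setup only: each sentence is described by a few hand-crafted
# linguistic features, and the label says whether it belongs in the summary.
rng = np.random.default_rng(0)
X = rng.random((500, 3))          # position, length, tf-idf score (synthetic)
y = (0.6 * X[:, 0] + 0.4 * X[:, 2]
     + 0.1 * rng.standard_normal(500) > 0.5).astype(int)

ebm = ExplainableBoostingClassifier(
    feature_names=["sentence_position", "sentence_length", "tfidf_score"],
    interactions=2,               # allow pairwise interaction terms
)
ebm.fit(X, y)

# Each feature's learned shape function can be inspected (e.g. via
# ebm.explain_global()), which is what makes the summarizer interpretable.
print(ebm.predict(X[:5]))
```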
Any strategy used to distribute a robot ensemble over a set of sequential tasks is subject to inaccuracy due to robot-level uncertainties and environmental influences on the robots' behavior. We approach the problem of inaccuracy during task allocation by modeling and controlling the overall ensemble behavior. Our model represents the allocation problem as a stochastic jump process, and we regulate the mean and variance of such a process. The main contributions of this paper are establishing a structure for the transition rates of the equivalent stochastic jump process and formally showing that this approach leads to decoupled parameters, which allow us to adjust the first- and second-order moments of the ensemble distribution over tasks and give us the flexibility to decrease the variance of the desired final distribution. This allows us to directly shape the impact of uncertainties on the group allocation over tasks. We introduce a detailed procedure to design the gains that achieve the desired mean, and we show how the additional parameters impact the covariance matrix, which is directly associated with the degree of task allocation precision. Our simulation and experimental results illustrate the successful control of several robot ensembles during task allocation.
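As a purely illustrative companion, not the paper's controller or rate design, the following Gillespie-style simulation shows how fixed transition rates induce a mean and a variance in the final allocation of an ensemble over sequential tasks; all rates and sizes are made up.

```python
import numpy as np

# Toy jump-process simulation: N robots hop between 3 sequential tasks with
# fixed forward/backward rates; repeating the run exposes the mean and the
# variance of the final allocation, the two moments the paper regulates.
rng = np.random.default_rng(1)
rate_fwd, rate_bwd = 0.5, 0.1
N, T, runs = 50, 20.0, 200

finals = []
for _ in range(runs):
    state = np.zeros(N, dtype=int)   # all robots start at task 0
    t = 0.0
    while t < T:
        fwd = rate_fwd * np.sum(state < 2)   # robots able to move forward
        bwd = rate_bwd * np.sum(state > 0)   # robots able to move backward
        t += rng.exponential(1.0 / (fwd + bwd))
        if rng.random() < fwd / (fwd + bwd):
            state[rng.choice(np.flatnonzero(state < 2))] += 1
        else:
            state[rng.choice(np.flatnonzero(state > 0))] -= 1
    finals.append(np.bincount(state, minlength=3) / N)

finals = np.array(finals)
print("mean allocation:", finals.mean(axis=0))
print("variance:       ", finals.var(axis=0))
```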
This paper focuses on the broadcast of information on robot networks with stochastic network interconnection topologies. Problematic communication networks are almost unavoidable in areas where we wish to deploy multi-robot systems, usually due to a lack of environmental consistency, accessibility, and structure. We tackle this problem by modeling the broadcast of information in a multi-robot communication network as a stochastic process with random arrival times, which can be produced by irregular robot movements, wireless attenuation, and other environmental factors. Using this model, we provide and analyze a receding horizon control strategy to control the statistics of the information broadcast. The resulting strategy compels the robots to redirect their communication resources to different neighbors according to the current state of the propagation process in order to fulfill global broadcast requirements. Based on this method, we provide an approach to compute the expected time to broadcast the message to all nodes. Numerical examples are provided to illustrate the results.
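As a simplified stand-in for the expected-broadcast-time computation (a Monte Carlo estimate rather than the paper's analytical approach), one can model each informed-to-uninformed link as firing after an exponentially distributed delay; the graph, rate, and sample count below are arbitrary.

```python
import numpy as np
import networkx as nx

# Monte Carlo sketch: each informed robot passes the message to an
# uninformed neighbor after an exponentially distributed delay, and we
# average the time until every node has received the message.
rng = np.random.default_rng(2)
G = nx.erdos_renyi_graph(12, 0.3, seed=2)
rate = 1.0  # mean one transmission per unit time per active link

def broadcast_time(G, source=0):
    informed = {source}
    t = 0.0
    while len(informed) < G.number_of_nodes():
        # active links connect informed robots to uninformed neighbors
        frontier = [(u, v) for u in informed for v in G.neighbors(u)
                    if v not in informed]
        if not frontier:
            return np.inf  # disconnected graph: broadcast cannot finish
        t += rng.exponential(1.0 / (rate * len(frontier)))
        informed.add(frontier[rng.integers(len(frontier))][1])
    return t

times = [broadcast_time(G) for _ in range(500)]
print("estimated expected broadcast time:", np.mean(times))
```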
This paper introduces the Forgotten Realms Wiki (FRW) data set and domain-specific natural language generation using FRW, along with related analyses. Forgotten Realms is the de facto default setting of the popular open-ended tabletop fantasy role-playing game, Dungeons & Dragons. The data set was extracted from the Forgotten Realms Fandom wiki, which consists of over 45,200 articles. The FRW data set comprises 11 sub-data sets in a number of formats: raw plain text, plain text annotated by article title, directed link graphs, wiki info-boxes annotated by the wiki article title, a Poincaré embedding of the first link graph, and multiple Word2Vec and Doc2Vec models of the corpus. This is the first data set of this size for the Dungeons & Dragons domain. We then present a pairwise similarity comparison benchmark that utilizes several similarity measures. In addition, we perform D&D domain-specific natural language generation using the corpus and evaluate the named entity classification with respect to the lore of Forgotten Realms.
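As a minimal sketch of the kind of pairwise similarity comparison such a benchmark performs, assuming gensim Word2Vec and a toy lore-flavored corpus in place of the released FRW models:

```python
from gensim.models import Word2Vec

# Toy stand-in for the FRW Word2Vec models: a few lore-flavored sentences
# (the real models are trained on the full 45,200-article corpus).
corpus = [
    ["drizzt", "is", "a", "drow", "ranger", "from", "menzoberranzan"],
    ["elminster", "is", "a", "wizard", "of", "shadowdale"],
    ["waterdeep", "is", "a", "city", "on", "the", "sword", "coast"],
    ["menzoberranzan", "is", "a", "drow", "city", "in", "the", "underdark"],
]

model = Word2Vec(corpus, vector_size=32, window=3, min_count=1,
                 epochs=200, seed=0)

# Pairwise cosine similarity between entity vectors, the kind of measure a
# similarity-comparison benchmark would aggregate over many entity pairs.
print(model.wv.similarity("drizzt", "menzoberranzan"))
print(model.wv.similarity("drizzt", "waterdeep"))
```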
Early recognition of clinical deterioration (CD) is of vital importance for saving patients from exacerbation or death. Electronic health record (EHR) data have been widely employed in Early Warning Scores (EWS) to measure the risk of CD in hospitalized patients. Recently, EHR data have also been utilized in machine learning (ML) models to predict mortality and CD, and these models have shown superior performance in CD prediction compared to EWS. Since EHR data are structured and tabular, conventional ML models are generally applied to them, and less effort has been put into evaluating the performance of artificial neural networks on EHR data. Thus, in this article, an extremely boosted neural network (XBNet) is used to predict CD, and its performance is compared to that of eXtreme Gradient Boosting (XGBoost) and random forest (RF) models. For this purpose, 103,105 samples from thirteen Brazilian hospitals are used to generate the models. Moreover, principal component analysis (PCA) is employed to verify whether it can improve the performance of the adopted models. The performance of the ML models and of the Modified Early Warning Score (MEWS), a representative EWS, is evaluated for CD prediction in terms of accuracy, precision, recall, F1-score, and geometric mean (G-mean) in a 10-fold cross-validation approach. According to the experiments, the XGBoost model obtained the best results in predicting CD from the Brazilian hospitals' data.
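As a hedged sketch of the evaluation protocol only (synthetic data in place of the hospitals' EHR tables, XBNet omitted as it is less standard, and G-mean left out since it needs a custom scorer), scikit-learn's 10-fold cross-validation can score XGBoost and RF on several of the reported metrics:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate
from xgboost import XGBClassifier

# Synthetic, imbalanced stand-in for the EHR table (the real study uses
# 103,105 samples from thirteen Brazilian hospitals).
X, y = make_classification(n_samples=2000, n_features=20, weights=[0.9],
                           random_state=0)

models = {
    "XGBoost": XGBClassifier(n_estimators=200, eval_metric="logloss"),
    "RandomForest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# 10-fold cross-validation over several of the paper's metrics.
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=10,
                            scoring=["accuracy", "precision", "recall", "f1"])
    print(name, {k: round(v.mean(), 3) for k, v in scores.items()
                 if k.startswith("test_")})
```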
State-of-the-art brain tumor segmentation is based on deep learning models applied to multi-modal MRIs. Currently, these models are trained on images after a preprocessing stage that involves registration, interpolation, brain extraction (BE, also known as skull-stripping) and manual correction by an expert. However, for clinical practice, this last step is tedious and time-consuming and, therefore, not always feasible, resulting in skull-stripping faults that can negatively impact the tumor segmentation quality. Still, the extent of this impact has never been measured for any of the many different BE methods available. In this work, we propose an automatic brain tumor segmentation pipeline and evaluate its performance with multiple BE methods. Our experiments show that the choice of a BE method can compromise up to 15.7% of the tumor segmentation performance. Moreover, we propose training and testing tumor segmentation models on non-skull-stripped images, effectively discarding the BE step from the pipeline. Our results show that this approach leads to a competitive performance at a fraction of the time. We conclude that, in contrast to the current paradigm, training tumor segmentation models on non-skull-stripped images can be the best option when high performance in clinical practice is desired.
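As a generic illustration of how such an impact is quantified (a standard metric, not the paper's exact pipeline), the Dice coefficient below measures the overlap between a predicted tumor mask and the ground truth; the toy masks mimic segmentations degraded by skull-stripping faults of different severities.

```python
import numpy as np

# Dice score: the standard overlap metric in BraTS-style evaluations of
# tumor segmentation performance.
def dice(pred, truth):
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum())

rng = np.random.default_rng(3)
truth = rng.random((64, 64, 64)) > 0.9   # toy ground-truth tumor mask
pred_mild = truth.copy()
pred_mild[:8] = False                    # mild skull-stripping fault
pred_severe = truth.copy()
pred_severe[:24] = False                 # severe skull-stripping fault

print("mild fault Dice:  ", round(dice(pred_mild, truth), 3))
print("severe fault Dice:", round(dice(pred_severe, truth), 3))
```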